Parallel Three-sequence Alignment with Space-efficient
Abstract
Sequence alignment is a fundamental problem in computational biology. Many sequence alignment methods have been proposed, such as pair-wise sequence alignment, multiple sequence alignment (MSA), syntenic alignment, and constrained multiple sequence alignment. Among them, pair-wise sequence alignment is the most commonly used. Recently, MSA has become increasingly important and has been used to solve many problems in molecular biology. The sum-of-pairs MSA problem has been proved to be NP-complete. A progressive algorithm based on pair-wise sequence alignments has been used to solve the MSA problem. However, progressive MSA has a drawback, which may be addressed by using three-sequence alignment as the basic step instead of pair-wise sequence alignment. Three-sequence alignment can be solved sequentially in O(mnl) time and O(mn) space, where m, n, and l are the lengths of the sequences to be aligned. These complexities limit its applicability, so reducing them is an important issue. In this paper, we propose a parallel algorithm for three-sequence alignment. Our parallel algorithm requires O((mn)/p) space and O((mnl)/p) time, where p is the number of processors. Both theoretical analysis and experimental results are presented. The experimental results show that our parallel algorithm is applicable and achieves good speedup.
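To make the sequential O(mnl) time and O(mn) space bounds concrete, here is a minimal Python sketch of a three-sequence dynamic program that computes only the optimal sum-of-pairs score. It is not the paper's parallel algorithm. The scoring values (match 1, mismatch -1, gap -2) and the names `pair_score`, `column_score`, and `three_seq_score` are assumptions made for illustration. The sketch sweeps over the third sequence and keeps just two (m+1) x (n+1) planes in memory, which is where the O(mn) space comes from.

```python
from itertools import product

GAP = "-"

def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of symbols in a column (assumed scoring values)."""
    if x == GAP and y == GAP:
        return 0
    if x == GAP or y == GAP:
        return gap
    return match if x == y else mismatch

def column_score(x, y, z):
    """Sum-of-pairs score of one alignment column over three sequences."""
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)

# The 7 possible moves: which of the three sequences consume a character.
MOVES = [d for d in product((0, 1), repeat=3) if any(d)]

def three_seq_score(a, b, c):
    """Optimal global sum-of-pairs score using two O(mn) planes."""
    m, n, l = len(a), len(b), len(c)
    neg_inf = float("-inf")
    prev = None  # plane for k - 1
    for k in range(l + 1):
        cur = [[neg_inf] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            for j in range(n + 1):
                if i == 0 and j == 0 and k == 0:
                    cur[i][j] = 0
                    continue
                best = neg_inf
                for di, dj, dk in MOVES:
                    if di > i or dj > j or dk > k:
                        continue  # move would step outside the cube
                    src = prev if dk else cur
                    x = a[i - 1] if di else GAP
                    y = b[j - 1] if dj else GAP
                    z = c[k - 1] if dk else GAP
                    cand = src[i - di][j - dj] + column_score(x, y, z)
                    if cand > best:
                        best = cand
                cur[i][j] = best
        prev = cur
    return prev[m][n]
```

Each of the (m+1)(n+1)(l+1) cells examines at most seven predecessors, which gives the O(mnl) running time. Recovering the alignment itself, rather than only its score, within the same space bound would need an additional technique such as divide and conquer, which this sketch does not attempt.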
Related resources
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Parallel Sequence Alignment in Limited Space
Sequence comparison with affine gap costs is a problem that is readily parallelizable on simple single-instruction, multiple-data stream (SIMD) parallel processors using only constant space per processing element. Unfortunately, the twin problem of sequence alignment, finding the optimal character-by-character correspondence between two sequences, is more complicated. While the innovative O(n2)...
Efficient Cache-oblivious String Algorithms for Bioinformatics
For each of these three problems we present cache-oblivious algorithms that match the best-known time complexity, match or improve the best-known space complexity, and improve significantly over the cache-efficiency of earlier algorithms. We also show that these algorithms are easily parallelizable, and we analyze their parallel performance. We present experimental results that show that all th...
Space-Efficient Parallel Algorithms for the Constrained Multiple Sequence Alignment Problem
Given sequences S1, S2, ..., Sn and a pattern string P, the constrained multiple sequence alignment (CMSA) problem is to align similar subsequences of these sequences with the constraint that the alignment “contains” P. The CMSA problem can be considered as an optimal path search problem in the dynamic programming matrix. The problem has a dynamic programming solution that requires O(2|S1||S2...
Parallel Fast Linear Space Alignment
We introduce a new parallel algorithm for pairwise sequence alignment, called Parallel FastLSA, that finds optimal answers, uses linear space with respect to problem size and number of processors, and achieves good speedups in practice. Computational biologists use sequence alignment algorithms to compare sequences of DNA (or proteins). Since the sequences can be long, space efficiency and turn...